Global Mortality Comparison of Alzheimer’s Disease and Other Dementias

(Alzheimer’s Association, 2019)

Data Origins

This project focuses on the global mortality rate of Alzheimer’s disease and other forms of dementia. The raw dataset was accessed via Our World in Data (World Health Organization, 2025) and is derived from the World Health Organization’s (WHO) Global Health Estimates (GHE) of 2023 (World Health Organization, 2023).

WHO’s GHE reports death and disability statistics by region, country, sex, age and cause. The dataset used here gives the death rate from Alzheimer’s disease and other dementias, per 100,000 people, from 2000 to 2021. The estimates are produced from national vital registration data, the latest estimates from WHO technical programs, United Nations partners and inter-agency groups, and the Global Burden of Disease.

References for the raw dataset:

World Health Organization. (2023). Global Health Estimates. https://www.who.int/data/global-health-estimates

World Health Organization. (2025). Death rate from Alzheimer’s. Our World in Data. https://archive.ourworldindata.org/20250909-093708/grapher/death-rate-from-alzheimers-other-dementias-ghe.html?tab=table


The Beginning of the Project

First, load the following packages in the console. I had to load all of them because I was working on a Linux computer. On Windows or macOS you probably won’t need every one, just tidyverse, here, ggplot2 and possibly plotly. Note that ggplot2, readr and dplyr are attached automatically when tidyverse is loaded, so listing them separately is harmless but not required.

library(tidyverse)
library(here)
library(ggplot2)
library(readr)
library(scatterplot3d)
library(plotly)
library(dplyr)
library(viridis)
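
If one of the library() calls fails, the package is probably not installed. The sketch below is one way to check before loading, using only base R. The package names are copied from the calls above; the variable names are just for illustration.

```r
# Packages this walkthrough loads (names taken from the library() calls above)
required_packages <- c("tidyverse", "here", "ggplot2", "readr",
                       "scatterplot3d", "plotly", "dplyr", "viridis")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(required_packages, requireNamespace,
                       FUN.VALUE = logical(1), quietly = TRUE)

missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Not installed: ", paste(missing_packages, collapse = ", "))
  # install.packages(missing_packages)  # uncomment to install them
} else {
  message("All packages are installed.")
}
```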

Next you need to import the raw data set into R and give it a more convenient name.

death_rate <- read_csv("Raw_Data/death_rate.csv", show_col_types = FALSE)

Below are a summary of the raw dataset, a list of its column names and its first ten rows.

summary(death_rate)
##     Entity               Year     
##  Length:4422        Min.   :2000  
##  Class :character   1st Qu.:2005  
##  Mode  :character   Median :2010  
##                     Mean   :2010  
##                     3rd Qu.:2016  
##                     Max.   :2021  
##  Death rate from alzheimer disease and other dementias among both sexes
##  Min.   :  0.000                                                       
##  1st Qu.:  4.192                                                       
##  Median :  7.285                                                       
##  Mean   : 15.434                                                       
##  3rd Qu.: 16.270                                                       
##  Max.   :314.530
names(death_rate)
## [1] "Entity"                                                                
## [2] "Year"                                                                  
## [3] "Death rate from alzheimer disease and other dementias among both sexes"

Data Dictionary

  • Entity — Country or global estimate
  • Year — Calendar year (2000–2021)
  • Mortality_rate — Deaths per 100,000 people from Alzheimer’s disease and other dementias
head(death_rate, 10)
## # A tibble: 10 × 3
##    Entity       Year Death rate from alzheimer disease and other dementias amo…¹
##    <chr>       <dbl>                                                       <dbl>
##  1 Afghanistan  2000                                                        4.62
##  2 Afghanistan  2001                                                        4.68
##  3 Afghanistan  2002                                                        4.73
##  4 Afghanistan  2003                                                        4.75
##  5 Afghanistan  2004                                                        4.81
##  6 Afghanistan  2005                                                        4.93
##  7 Afghanistan  2006                                                        5.05
##  8 Afghanistan  2007                                                        4.91
##  9 Afghanistan  2008                                                        4.83
## 10 Afghanistan  2009                                                        4.9 
## # ℹ abbreviated name:
## #   ¹​`Death rate from alzheimer disease and other dementias among both sexes`

The Research Question

How does the mortality rate from Alzheimer’s disease and other dementias in the world’s three most populous countries (India, China, United States of America) compare across the 2000–2021 period?

Data Preparation

For this project, I decided to analyse only the data for the three most populous countries in the world. To reflect this in the clean data, I filtered out the rows for every other country and kept only India, China and the US. The original dataset consisted of 4,422 observations.

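# Define the entities to keep. The three countries are enough on their own; I added
# "Global" in case it would expand on the visualisations, but it did not, so next
# time I would leave it out.
# Note: the filtered data below has 66 rows (3 countries x 22 years), so "Global"
# does not match any Entity value in this dataset and contributes no rows.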
target_countries <- c("India", "China", "United States", "Global")
# Rename the long column name (I renamed mine Mortality_rate)
death_rate <- death_rate %>%
  rename(Mortality_rate = `Death rate from alzheimer disease and other dementias among both sexes`)
# Look at the new names
names(death_rate)
## [1] "Entity"         "Year"           "Mortality_rate"
# Filter the death_rate dataset to include only the target entities and the years 2000 to 2021
# The Entity column holds the country name and the Year column holds the year. Adjust if names(death_rate) shows something different.
filtered_data <- death_rate %>%
  filter(Entity %in% target_countries) %>%
  filter(Year >= 2000 & Year <= 2021)
# Drop any rows that contain missing values
filtered_data <- na.omit(filtered_data)
# Look at the final clean structure
head(filtered_data)
## # A tibble: 6 × 3
##   Entity  Year Mortality_rate
##   <chr>  <dbl>          <dbl>
## 1 China   2000           14.7
## 2 China   2001           15.3
## 3 China   2002           16.0
## 4 China   2003           16.7
## 5 China   2004           17.6
## 6 China   2005           18.4
summary(filtered_data)
##     Entity               Year      Mortality_rate  
##  Length:66          Min.   :2000   Min.   : 4.580  
##  Class :character   1st Qu.:2005   1st Qu.: 8.043  
##  Mode  :character   Median :2010   Median :22.600  
##                     Mean   :2010   Mean   :30.998  
##                     3rd Qu.:2016   3rd Qu.:43.532  
##                     Max.   :2021   Max.   :92.960

Your new, clean dataset should consist of 66 observations (3 countries × 22 years).
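
A quick way to confirm the row count, and to spot any requested entity that matched nothing, is sketched below in base R. The toy data frame and its values are made up for illustration, and the "World" label for a global row is an assumption, not something shown in the output above.

```r
# Toy stand-in for death_rate (hypothetical values, NOT the real data)
toy <- data.frame(
  Entity = rep(c("China", "India", "United States", "World"), each = 2),
  Year = rep(c(2000, 2001), times = 4),
  Mortality_rate = c(14.7, 15.3, 5.0, 5.1, 40.0, 41.0, 20.0, 20.5)
)

target_countries <- c("India", "China", "United States", "Global")

# Same filter as in the walkthrough, written in base R
filtered_toy <- toy[toy$Entity %in% target_countries, ]

# Which requested names never appear in the Entity column?
unmatched <- setdiff(target_countries, unique(filtered_toy$Entity))
print(unmatched)

# How many rows did each matched entity contribute?
rows_per_entity <- table(filtered_toy$Entity)
print(rows_per_entity)
```

Running the same two checks on filtered_data should show 22 rows per country and list any name in target_countries that is spelled differently in the dataset.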

Visualisations

Figure 1: A Line Graph to demonstrate the trend in Alzheimer’s Mortality Rate from 2000-2021

ggplot(filtered_data, aes(x = Year, y = Mortality_rate, colour = Entity)) +
  geom_line(linewidth = 2) +  # Thicker lines
  theme_minimal() +
  labs(
    title = "Alzheimer’s Mortality Rate (2000–2021)",
    x = "Year",
    y = "Deaths per 100,000",
    colour = "Country"
  ) +
  scale_colour_manual(values = c(
    "India" = "lightpink",
    "China" = "purple",
    "United States" = "lightblue"
  )) +
  theme(
    text = element_text(size = 14)  # makes labels easier to read
  )

This line graph shows the mortality rate from Alzheimer’s disease and other dementias, per 100,000 people, between 2000 and 2021. It is simple to read: the higher the line, the higher the mortality rate. At first glance, the US has a much higher mortality rate than China and India. I chose this graph first because it is simple yet effective.
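
To put numbers on what the lines show, one option is to compare the first and last year for each country. The sketch below does this in base R on a small toy data frame. The values are invented for illustration, and it assumes every country has a row for both the earliest and the latest year.

```r
# Toy stand-in for filtered_data (hypothetical values, NOT the real data)
toy <- data.frame(
  Entity = rep(c("China", "India", "United States"), each = 3),
  Year = rep(c(2000, 2010, 2021), times = 3),
  Mortality_rate = c(15, 25, 40, 5, 6, 8, 45, 70, 90)
)

# Rates in the earliest and latest year, one row per country
first_year <- toy[toy$Year == min(toy$Year), c("Entity", "Mortality_rate")]
last_year <- toy[toy$Year == max(toy$Year), c("Entity", "Mortality_rate")]

# Join the two on Entity and compute absolute and percentage change
change <- merge(first_year, last_year, by = "Entity",
                suffixes = c("_first", "_last"))
change$absolute_change <- change$Mortality_rate_last - change$Mortality_rate_first
change$percent_change <- 100 * change$absolute_change / change$Mortality_rate_first

print(change)
```

Swapping toy for filtered_data would give the real changes; the percentage column is useful because a country with a low starting rate can still show a large relative increase.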

Figure 3: A Boxplot to demonstrate the distribution of mortality rates in the target countries

library(plotly)

plot_ly(filtered_data, 
        x = ~Entity, 
        y = ~Mortality_rate, 
        color = ~Entity, 
        colors = c("India" = "lightpink",
                   "China" = "purple",
                   "United States" = "lightblue"),
        type = "box") %>%
  layout(
    title = "Distribution of Mortality Rates (2000–2021)",
    yaxis = list(title = "Deaths per 100,000")
  )

I also experimented with a boxplot in Figure 3, but I do not think it is as effective as the line graphs, especially for someone with no prior experience of visualisations. It could be quite hard to understand, but it still gets the general point across: the US has a higher mortality rate from Alzheimer’s and other dementias than China and India.
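
For readers who find the boxplot hard to interpret, the quartiles behind each box can be printed as a small table. The sketch below uses base R on invented toy values. It relies on the default method of quantile(), which may not match exactly how plotly computes the box edges, so treat small differences as expected.

```r
# Toy stand-in for filtered_data (hypothetical values, NOT the real data)
toy <- data.frame(
  Entity = rep(c("China", "India", "United States"), each = 3),
  Year = rep(c(2000, 2010, 2021), times = 3),
  Mortality_rate = c(15, 25, 40, 5, 6, 8, 45, 70, 90)
)

# Split the rates by country, then take the quartiles of each group
rates_by_country <- split(toy$Mortality_rate, toy$Entity)
quartiles <- t(sapply(rates_by_country, quantile, probs = c(0.25, 0.5, 0.75)))

# One row per country; columns are lower quartile, median, upper quartile
print(quartiles)
```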

Figure 4: An interactive heatmap of the mortality rate across target countries

p <- ggplot(filtered_data, aes(x = Year, y = Entity, fill = Mortality_rate)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "hotpink") +  # light blue (low) to hot pink (high)
  theme_minimal(base_size = 12) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 10),
    legend.position = "right"
  ) +
  labs(
    title = "Heatmap of Alzheimer’s Mortality Rate (2000–2021)",
    x = "Year",
    y = "Country",
    fill = "Deaths per 100k"
  )

ggplotly(p, tooltip = c("x", "y", "fill"))

Honestly, Figure 4 was my own curiosity getting the better of me and wanting to make a more aesthetic visualisation. Figure 4 is a heatmap that gives each country its own row to show how the mortality rate from Alzheimer’s has increased in each one over 21 years. Once again, we can see that the US changed from blue to pink, showing an increase in the mortality rate, whilst India remained in the lower range, as demonstrated by the blue.
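
A heatmap like this is essentially a country-by-year table with each cell coloured by its value. As a rough illustration, the sketch below builds that table in base R with xtabs() from invented toy values. With exactly one row per country and year, each cell simply holds that row's rate.

```r
# Toy stand-in for filtered_data (hypothetical values, NOT the real data)
toy <- data.frame(
  Entity = rep(c("China", "India", "United States"), each = 3),
  Year = rep(c(2000, 2010, 2021), times = 3),
  Mortality_rate = c(15, 25, 40, 5, 6, 8, 45, 70, 90)
)

# Rows are countries, columns are years, cells are the mortality rate
# (xtabs sums within each cell; with one row per cell that is the value itself)
rate_matrix <- xtabs(Mortality_rate ~ Entity + Year, data = toy)

print(rate_matrix)
```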


Summary

Considering that I had no previous coding experience at the start of this module, I think this project has allowed me to illustrate just how much I have learnt, both in and out of contact hours. I have learnt how to code, first and foremost, but also how to create aesthetically pleasing visualisations that are easy to understand.

If I had more time, and honestly more patience with RStudio, I would compare the mortality rates across all the countries in the original dataset and attempt to produce some visualisations that demonstrate the differences between each country’s mortality rate. I would also like to delve a bit deeper generically and try to understand why some countries have a higher mortality rate linked to Alzheimer’s than others, perhaps focusing on more neural factors or even just environmental influences in different cultures.

(Alzheimer’s Society, 2016)

References

Alzheimer’s Association. (2019). Dementia vs. Alzheimer’s disease: What is the difference? Alzheimer’s Association. https://www.alz.org/alzheimers-dementia/difference-between-dementia-and-alzheimer-s

Alzheimer’s Society. (2016). Facebook. Facebook.com. https://www.facebook.com/alzheimerssocietyuk/

World Health Organization. (2023). Global Health Estimates. https://www.who.int/data/global-health-estimates

World Health Organization. (2025). Death rate from Alzheimer’s. Our World in Data. https://archive.ourworldindata.org/20250909-093708/grapher/death-rate-from-alzheimers-other-dementias-ghe.html?tab=table